96 research outputs found

Shoelaces: An interactive tool for ribosome profiling processing and visualization

Author: Birkeland Åsmund
Chyżyńska Katarzyna
Valen Eivind
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/01/2020

Background: The emergence of ribosome profiling to map actively translating ribosomes has laid the foundation for a diverse range of studies on translational regulation. The data obtained with different variations of this assay is typically manually processed, which has created a need for tools that would streamline and standardize processing steps. Results: We present Shoelaces, a toolkit for ribosome profiling experiments automating read selection and filtering to obtain genuine translating footprints. Based on periodicity, favoring enrichment over the coding regions, it determines the read lengths corresponding to bona fide ribosome protected fragments. The specific codon under translation (P-site) is determined by automatic offset calculations resulting in sub-codon resolution. Shoelaces provides both a user-friendly graphical interface for interactive visualisation in a genome browser-like fashion and a command line interface for integration into automated pipelines. We process 79 libraries and show that studies typically discard excessive amounts of quality data in their manual analysis pipelines. Conclusions: Shoelaces streamlines ribosome profiling analysis offering automation of the processing, a range of interactive visualization features and export of the data into standard formats. Shoelaces stores all processing steps performed in an XML file that can be used by other groups to exactly reproduce the processing of a given study. We therefore anticipate that Shoelaces can aid researchers by automating what is typically performed manually and contribute to the overall reproducibility of studies. The tool is freely distributed as a Python package, with additional instructions, tutorial and demo datasets available at https://bitbucket.org/valenlab/shoelaces.

University of Bergen

REPARATION: ribosome profiling assisted (re-)annotation of bacterial genomes

Author: Giess Adam
Jonckheere Veronique
Menschaert Gerben
Ndah Elvis
Valen Eivind
Van Damme Petra
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2017

Prokaryotic genome annotation is highly dependent on automated methods, as manual curation cannot keep up with the exponential growth of sequenced genomes. Current automated methods depend heavily on sequence composition and often underestimate the complexity of the proteome. We developed RibosomeE Profiling Assisted (re-)AnnotaTION (REPARATION), a de novo machine learning algorithm that takes advantage of experimental protein synthesis evidence from ribosome profiling (Ribo-seq) to delineate translated open reading frames (ORFs) in bacteria, independent of genome annotation (https://github.com/Biobix/REPARATION). REPARATION evaluates all possible ORFs in the genome and estimates minimum thresholds based on a growth curve model to screen for spurious ORFs. We applied REPARATION to three annotated bacterial species to obtain a more comprehensive mapping of their translation landscape in support of experimental data. In all cases, we identified hundreds of novel (small) ORFs including variants of previously annotated ORFs and >70% of all (variants of) annotated protein coding ORFs were predicted by REPARATION to be translated. Our predictions are supported by matching mass spectrometry proteomics data, sequence composition and conservation analysis. REPARATION is unique in that it makes use of experimental translation evidence to intrinsically perform a de novo ORF delineation in bacterial genomes irrespective of the sequence features linked to open reading frames.

Ghent University Academic Bibliography

Ribosome signatures aid bacterial translation initiation site identification

Author: Chyżyńska Katarzyna
Giess Adam
Jonckheere Veronique
Ndah Elvis
Valen Eivind
Van Damme Petra
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Background: While methods for annotation of genes are increasingly reliable, the exact identification of translation initiation sites remains a challenging problem. Since the N-termini of proteins often contain regulatory and targeting information, developing a robust method for start site identification is crucial. Ribosome profiling reads show distinct patterns of read length distributions around translation initiation sites. These patterns are typically lost in standard ribosome profiling analysis pipelines, when reads from footprints are adjusted to determine the specific codon being translated. Results: Utilising these signatures in combination with nucleotide sequence information, we build a model capable of predicting translation initiation sites and demonstrate its high accuracy using N-terminal proteomics. Applying this to prokaryotic translatomes, we re-annotate translation initiation sites and provide evidence of N-terminal truncations and extensions of previously annotated coding sequences. These re-annotations are supported by the presence of structural and sequence-based features next to N-terminal peptide evidence. Finally, our model identifies 61 novel genes previously undiscovered in the Salmonella enterica genome. Conclusions: Signatures within ribosome profiling read length distributions can be used in combination with nucleotide sequence information to provide accurate genome-wide identification of translation initiation sites

Ghent University Academic Bibliography

Directory of Open Access Journals

Systematic clustering of transcription start site landscapes

Author: Parker Brian J.
Sandelin Albin
Valen Eivind
Zhao Xiaobei
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 11/12/2015

Genome-wide, high-throughput methods for transcription start site (TSS) detection have shown that most promoters have an array of neighboring TSSs where some are used more than others, forming a distribution of initiation propensities. TSS distributions (TSSDs) vary widely between promoters and earlier studies have shown that the TSSDs have biological implications in both regulation and function. However, no systematic study has been made to explore how many types of TSSDs and by extension core promoters exist and to understand which biological features distinguish them. In this study, we developed a new non-parametric dissimilarity measure and clustering approach to explore the similarities and stabilities of clusters of TSSDs. Previous studies have used arbitrary thresholds to arrive at two general classes: broad and sharp. We demonstrated that in addition to the previous broad/sharp dichotomy an additional category of promoters exists. Unlike typical TATA-driven sharp TSSDs where the TSS position can vary a few nucleotides, in this category virtually all TSSs originate from the same genomic position. These promoters lack epigenetic signatures of typical mRNA promoters and a substantial subset of them are mapping upstream of ribosomal protein pseudogenes. We present evidence that these are likely mapping errors, which have confounded earlier analyses, due to the high similarity of ribosomal gene promoters in combination with known G addition bias in the CAGE libraries. Thus, previous two-class separations of promoter based on TSS distributions are motivated, but the ultra-sharp TSS distributions will confound downstream analyses if not removed. This work was supported by a grant from the Novo Nordisk Foundation, http://www.novonordiskfonden.dk/. The European Research Council (http://erc.europa.eu/) has provided financial support to Dr. Sandelin under the EU 7th Framework Programme (FP7/2007-2013)/ERC grant agreement 204135.

The Australian National University

Systematic clustering of transcription start site landscapes

Author: Parker Brian J
Sandelin Albin Gustav
Valen Eivind
Zhao Xiaobei
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2011

Directory of Open Access Journals

Copenhagen University Research Information System

PubMed Central

Deep conservation of ribosome stall sites across RNA processing genes

Author: Chyżyńska Katarzyna
Grellscheid Sushma N
Jones Carl
Labun Kornel
Valen Eivind
Publication venue: Oxford University Press
Publication date: 01/01/2021

The rate of translation can vary depending on the mRNA template. During the elongation phase the ribosome can transiently pause or permanently stall. A pause can provide the nascent protein with the time to fold or be transported, while stalling can serve as quality control and trigger degradation of aberrant mRNA and peptide. Ribosome profiling has allowed for the genome-wide detection of such pauses and stalls, but due to library-specific biases, these predictions are often unreliable. Here, we take advantage of the deep conservation of protein synthesis machinery, hypothesizing that similar conservation could exist for functionally important locations of ribosome slowdown, here collectively called stall sites. We analyze multiple ribosome profiling datasets from phylogenetically diverse eukaryotes: yeast, fruit fly, zebrafish, mouse and human to identify conserved stall sites. We find thousands of stall sites across multiple species, with the enrichment of proline, glycine and negatively charged amino acids around conserved stalling. Many of the sites are found in RNA processing genes, suggesting that stalling might have a conserved role in RNA metabolism. In summary, our results provide a rich resource for the study of conserved stalling and indicate possible roles of stalling in gene regulation

Durham Research Online

University of Bergen

NORA - Norwegian Open Research Archives

A code for transcription initiation in mammalian genomes

Author: Carninci Piero
Frith Martin C.
Hayashizaki Yoshihide
Krogh Anders
Sandelin Albin
Valen Eivind
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 01/01/2007

Genome-wide detection of transcription start sites (TSSs) has revealed that RNA Polymerase II transcription initiates at millions of positions in mammalian genomes. Most core promoters do not have a single TSS, but an array of closely located TSSs with different rates of initiation. As a rule, genes have more than one such core promoter; however, defining the boundaries between core promoters is not trivial. These discoveries prompt a re-evaluation of our models for transcription initiation. We describe a new framework for understanding the organization of transcription initiation. We show that initiation events are clustered on the chromosomes at multiple scales (clusters within clusters), indicating multiple regulatory processes. Within the smallest of such clusters, which can be interpreted as core promoters, the local DNA sequence predicts the relative transcription start usage of each nucleotide with a remarkable 91% accuracy, implying the existence of a DNA code that determines TSS selection. Conversely, the total expression strength of such clusters is only partially determined by the local DNA sequence. Thus, the overall control of transcription can be understood as a combination of large- and small-scale effects; the selection of transcription start sites is largely governed by the local DNA sequence, whereas the transcriptional activity of a locus is regulated at a different level; it is affected by distal features or events such as enhancers and chromatin remodeling.

Crossref

Copenhagen University Research Information System

PubMed Central

University of Queensland eSpace

Small Open Reading Frames, How to Find Them and Determine Their Function

Author: Kute Preeti
Soukarieh Omar
Tjeldnes Håkon
Trégouët David-Alexandre
Valen Eivind Dale
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2022

Advances in genomics and molecular biology have revealed an abundance of small open reading frames (sORFs) across all types of transcripts. While these sORFs are often assumed to be non-functional, many have been implicated in physiological functions and a significant number of sORFs have been described in human diseases. Thus, sORFs may represent a hidden repository of functional elements that could serve as therapeutic targets. Unlike protein-coding genes, it is not necessarily the encoded peptide of an sORF that enacts its function; sometimes simply the act of translating an sORF might have a regulatory role. Indeed, the most studied sORFs are located in the 5′UTRs of coding transcripts and can have a regulatory impact on the translation of the downstream protein-coding sequence. However, sORFs have also been abundantly identified in non-coding RNAs including lncRNAs, circular RNAs and ribosomal RNAs, suggesting that sORFs may be diverse in function. Of the many different experimental methods used to discover sORFs, the most commonly used are ribosome profiling and mass spectrometry. These can confirm interactions between transcripts and ribosomes and the production of a peptide, respectively. Extensions to ribosome profiling, which also capture scanning ribosomes, have further made it possible to see how sORFs impact the translation initiation of mRNAs. While high-throughput techniques have made the identification of sORFs less difficult, defining their function, if any, is typically more challenging. Together, the abundance and potential function of many of these sORFs argue for the necessity of including sORFs in gene annotations and systematically characterizing these to understand their potential functional roles. In this review, we will focus on the high-throughput methods used in the detection and characterization of sORFs and discuss techniques for validation and functional characterization.

University of Bergen

PubMed Central

CHOPCHOP: a CRISPR/Cas9 and TALEN web tool for genome editing

Author: Church George M.
Cruz José M.
Gagnon James A.
Montague Tessa G.
Valen Eivind
Publication venue: 'Oxford University Press (OUP)'
Publication date: 05/01/2015

Major advances in genome editing have recently been made possible with the development of the TALEN and CRISPR/Cas9 methods. The speed and ease of implementing these technologies has led to an explosion of mutant and transgenic organisms. A rate-limiting step in efficiently applying TALEN and CRISPR/Cas9 methods is the selection and design of targeting constructs. We have developed an online tool, CHOPCHOP (https://chopchop.rc.fas.harvard.edu), to expedite the design process. CHOPCHOP accepts a wide range of inputs (gene identifiers, genomic regions or pasted sequences) and provides an array of advanced options for target selection. It uses efficient sequence alignment algorithms to minimize search times, and rigorously predicts off-target binding of single-guide RNAs (sgRNAs) and TALENs. Each query produces an interactive visualization of the gene with candidate target sites displayed at their genomic positions and color-coded according to quality scores. In addition, for each possible target site, restriction sites and primer candidates are visualized, facilitating a streamlined pipeline of mutant generation and validation. The ease-of-use and speed of CHOPCHOP make it a valuable tool for genome engineering

Harvard University - DASH

CHOPCHOP v3: expanding the CRISPR web toolbox beyond genome editing

Author: Krause Maximilian
Labun Kornel
Montague Tessa G.
Tjeldnes Håkon
Torres Cleuren Yamila Nicole
Valen Eivind
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2019

The CRISPR–Cas system is a powerful genome editing tool that functions in a diverse array of organisms and cell types. The technology was initially developed to induce targeted mutations in DNA, but CRISPR–Cas has now been adapted to target nucleic acids for a range of purposes. CHOPCHOP is a web tool for identifying CRISPR–Cas single guide RNA (sgRNA) targets. In this major update of CHOPCHOP, we expand our toolbox beyond knockouts. We introduce functionality for targeting RNA with Cas13, which includes support for alternative transcript isoforms and RNA accessibility predictions. We incorporate new DNA targeting modes, including CRISPR activation/repression, targeted enrichment of loci for long-read sequencing, and prediction of Cas9 repair outcomes. Finally, we expand our results page visualization to reveal alternative isoforms and downstream ATG sites, which will aid users in avoiding the expression of truncated proteins. The CHOPCHOP web tool now supports over 200 genomes and we have released a command-line script for running larger jobs and handling unsupported genomes. CHOPCHOP v3 can be found at https://chopchop.cbu.uib.no

University of Bergen

NORA - Norwegian Open Research Archives